National Repository of Grey Literature
Text Data Clustering
Leixner, Petr ; Burgetová, Ivana (referee) ; Bartík, Vladimír (advisor)
Text data clustering can be used to analyze, navigate, and structure large sets of text or hypertext documents. The basic idea is to group the documents into a set of clusters on the basis of their similarity. The well-known methods of text clustering, however, do not really solve the specific problems of text clustering, such as the high dimensionality of the input data, the very large size of the databases, and the understandability of the cluster descriptions. This work deals with these problems and describes a modern method of text data clustering based on frequent term sets, which tries to address the deficiencies of other clustering methods.
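As an illustration of the frequent-term-set idea mentioned in this abstract, the following is a minimal sketch and not the method from the thesis itself. It assumes whitespace tokenization and a brute-force enumeration of term sets; the function name `frequent_term_sets` and the parameters `min_support` and `max_size` are hypothetical. Each frequent term set is mapped to its cover (the documents containing all of its terms), which can serve as a cluster candidate whose description is the term set.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support=2, max_size=2):
    """Map each frequent term set to its cover.

    The cover is the list of indices of documents that contain every
    term in the set. Brute-force enumeration: suitable only for tiny
    vocabularies (assumption made for illustration).
    """
    doc_terms = [set(d.lower().split()) for d in docs]
    vocabulary = sorted(set().union(*doc_terms))
    result = {}
    for size in range(1, max_size + 1):
        for terms in combinations(vocabulary, size):
            cover = [i for i, ts in enumerate(doc_terms) if ts.issuperset(terms)]
            if len(cover) >= min_support:
                result[terms] = cover
    return result


docs = ["data mining text", "text mining methods", "image data"]
candidates = frequent_term_sets(docs, min_support=2)
```

A practical implementation would replace the brute-force loop with an Apriori-style search and then select a subset of covers with low mutual overlap.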
Methods for Information Extraction in Text Documents
Sychra, Tomáš ; Burget, Radek (referee) ; Bartík, Vladimír (advisor)
Knowledge discovery in text documents is a part of data mining. However, text documents have different properties from regular databases. This project gives an overview of methods for knowledge discovery in text documents. The most common task in this area is document classification, and various approaches to text classification are described. Finally, I present the Winnow algorithm, which is expected to perform better than the other classification algorithms. The work includes a description of the Winnow implementation and an overview of experimental results.
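For readers unfamiliar with Winnow, here is a minimal sketch of the common Winnow2 variant, not the implementation described in the thesis. It assumes binary features (for example, presence of a term in a document), a threshold equal to the number of features, and a promotion factor of 2; the names `train_winnow` and `predict` are hypothetical.

```python
def predict(weights, x, threshold):
    """Return 1 if the weighted sum of active features reaches the threshold."""
    return 1 if sum(w for w, xi in zip(weights, x) if xi) >= threshold else 0


def train_winnow(examples, n_features, alpha=2.0, epochs=5):
    """Winnow2: multiplicative updates on the weights of active features."""
    weights = [1.0] * n_features
    threshold = float(n_features)  # assumption: a common default choice
    for _ in range(epochs):
        for x, label in examples:
            guess = predict(weights, x, threshold)
            if guess == label:
                continue
            # Promote on a missed positive, demote on a false positive.
            factor = alpha if label == 1 else 1.0 / alpha
            weights = [w * factor if xi else w for w, xi in zip(weights, x)]
    return weights, threshold


# Toy data: the label is 1 when feature 0 or feature 1 is present.
examples = [
    ([1, 0, 0, 0], 1),
    ([0, 1, 0, 0], 1),
    ([0, 0, 1, 0], 0),
    ([0, 0, 0, 1], 0),
    ([1, 0, 1, 0], 1),
    ([0, 0, 1, 1], 0),
]
weights, threshold = train_winnow(examples, n_features=4)
```

Because updates are multiplicative and touch only active features, the algorithm tends to cope well with many irrelevant terms, which is one reason it is used for text classification.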
Methodology of non-invasive survey of library units
Vávrová, Petra ; Neoralová, Jitka ; Novotná, Dana ; Součková, Magda ; Kazanskii, Andrei ; Blecha, Tomáš ; Popelková, Daniela ; Kohoutová, Kristina ; Kocour, Vladimír
The aim of the presented methodology is to introduce selected technologies and procedures for surveying library documents that are expected to contain secondarily used text carriers (so-called fragments), historically valuable fragments of damaged or removed texts and provenance marks, internal defects of the bookbinding, or insect infestation.
Automation of Stopword Generation
Krupník, Jiří
This diploma thesis focuses on automating stopword generation as one method of pre-processing textual documents. It analyses the influence of stopword removal on the results of data mining tasks (classification and clustering). First, text mining techniques and frequently used algorithms are described. Methods of creating domain-specific stopword lists are then described in detail. Finally, the results of testing on large collections of text files and the implementation methods are presented and discussed.
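One simple way to generate a domain-specific stopword list is to flag terms that occur in most documents of a collection. The sketch below illustrates that idea only; the abstract does not say which methods the thesis uses, so the document-frequency criterion, the function name `candidate_stopwords`, and the `min_doc_fraction` parameter are all assumptions.

```python
from collections import Counter


def candidate_stopwords(documents, min_doc_fraction=0.8):
    """Return terms that appear in at least the given fraction of documents."""
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document (document frequency).
        doc_freq.update(set(text.lower().split()))
    cutoff = min_doc_fraction * len(documents)
    return sorted(term for term, df in doc_freq.items() if df >= cutoff)


documents = [
    "the cat sat on the mat",
    "the dog ate the bone",
    "a bird saw the cat",
]
stopwords = candidate_stopwords(documents)
```

On this toy collection only "the" occurs in every document, so it is the single candidate. The cutoff would need tuning on a real corpus.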
